Take Home Exercise 1 - Investigation of Water points in Nigeria
Overview
Water is a crucial resource for humanity. People must have access to clean water in order to be healthy. It promotes a healthy environment, peace and security, and a sustainable economy. However, more than 40% of the world’s population lacks access to enough clean water. According to UN-Water, 1.8 billion people will be living in places with absolute water scarcity by 2025. Food security is one of the many areas gravely threatened by the water problem: agriculture accounts for over 70% of the freshwater used on Earth.
Severe water shortages and water quality issues are seen mostly in underdeveloped countries. Up to 80% of infections in developing nations are attributed to inadequate water and sanitation infrastructure.
Despite technological advancement, providing rural communities with clean water continues to be a key development concern in many countries around the world, especially on the African continent.
This study uses appropriate global and local measures of spatial association to reveal the spatial patterns of non-functional water points. In this assignment, we look at Nigeria at the Local Government Area (LGA) level.
Getting Started
First, the required packages are loaded into the R environment. The required packages are sf, tidyverse, spdep, tmap and funModeling. They are loaded with the code below:
pacman::p_load(sf, tidyverse, spdep, tmap, funModeling)
Spatial Data
The spatial dataset used in this assignment is the Nigeria Level-2 Administrative Boundary dataset, downloaded from Center for Humanitarian Data - Nigeria - Subnational Administrative Boundaries.
We load the spatial features with st_read() from the sf package.
As the data we want is in WGS 84, we set crs to 4326.
We do not use st_transform() at this stage because it can result in missing points after transformation, which would skew our study.
nga = st_read(dsn = "data/geospatial",
              layer = "nga_admbnda_adm2_osgof_20190417",
              crs = 4326)
Reading layer `nga_admbnda_adm2_osgof_20190417' from data source
  `D:\Allanckw\ISSS624\Take-Home_Ex1\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 774 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS: WGS 84
#nigeria_lga_sf = st_transform(nigeria_lga_sf, crs=4326) # causes missing points
We can use st_crs() to verify the coordinate system of the object.
st_crs(nga)
Coordinate Reference System:
User input: EPSG:4326
wkt:
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
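To make the difference between declaring a CRS and transforming to another one more concrete, here is a minimal sketch on a single made-up point (the coordinates are illustrative and are not taken from the dataset). Setting crs = 4326 only labels the coordinates as WGS 84, whereas st_transform() recomputes them in the target system. EPSG:26392 is used here only because it is the projected system applied later in this exercise.

```r
library(sf)

# One illustrative point (longitude, latitude); not from the Nigeria data
pt <- st_sfc(st_point(c(7.4, 9.1)), crs = 4326)

st_crs(pt)$epsg      # EPSG code attached to the object
st_is_longlat(pt)    # TRUE for geographic (degree-based) coordinates

# Re-project to Minna / Nigeria Mid Belt; coordinates are now in metres
pt_proj <- st_transform(pt, 26392)
st_is_longlat(pt_proj)
st_coordinates(pt_proj)
```

Checking st_is_longlat() before any distance-based step is a cheap way to confirm which kind of coordinates an object currently holds.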
Before we start analyzing the data, let us take a look at some characteristics of the spatial features to get a sense of what we are dealing with. We can use glimpse() to do that.
glimpse(nga)
Rows: 774
Columns: 17
$ Shape_Leng <dbl> 0.2370744, 0.2624772, 3.0753158, 2.5379842, 0.6871498, 1.06…
$ Shape_Area <dbl> 0.0015239210, 0.0035311037, 0.3268678399, 0.0683785064, 0.0…
$ ADM2_EN <chr> "Aba North", "Aba South", "Abadam", "Abaji", "Abak", "Abaka…
$ ADM2_PCODE <chr> "NG001001", "NG001002", "NG008001", "NG015001", "NG003001",…
$ ADM2_REF <chr> "Aba North", "Aba South", "Abadam", "Abaji", "Abak", "Abaka…
$ ADM2ALT1EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADM2ALT2EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADM1_EN <chr> "Abia", "Abia", "Borno", "Federal Capital Territory", "Akwa…
$ ADM1_PCODE <chr> "NG001", "NG001", "NG008", "NG015", "NG003", "NG011", "NG02…
$ ADM0_EN <chr> "Nigeria", "Nigeria", "Nigeria", "Nigeria", "Nigeria", "Nig…
$ ADM0_PCODE <chr> "NG", "NG", "NG", "NG", "NG", "NG", "NG", "NG", "NG", "NG",…
$ date <date> 2016-11-29, 2016-11-29, 2016-11-29, 2016-11-29, 2016-11-29…
$ validOn <date> 2019-04-17, 2019-04-17, 2019-04-17, 2019-04-17, 2019-04-17…
$ validTo <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ SD_EN <chr> "Abia South", "Abia South", "Borno North", "Federal Capital…
$ SD_PCODE <chr> "NG00103", "NG00103", "NG00802", "NG01501", "NG00302", "NG0…
$ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((7.401109 5...., MULTIPOLYGON (…
Rather than inspecting each LGA, we can use freq() from the funModeling package to display how the LGAs are distributed across Level 1 administrative areas (the states of Nigeria). We only zoom in to the LGA level when we perform the water point analysis.
freq(data=nga, input = 'ADM1_EN')
ADM1_EN frequency percentage cumulative_perc
1 Kano 44 5.68 5.68
2 Katsina 34 4.39 10.07
3 Oyo 33 4.26 14.33
4 Akwa Ibom 31 4.01 18.34
5 Osun 30 3.88 22.22
6 Borno 27 3.49 25.71
7 Imo 27 3.49 29.20
8 Jigawa 27 3.49 32.69
9 Delta 25 3.23 35.92
10 Niger 25 3.23 39.15
11 Benue 23 2.97 42.12
12 Kaduna 23 2.97 45.09
13 Rivers 23 2.97 48.06
14 Sokoto 23 2.97 51.03
15 Adamawa 21 2.71 53.74
16 Anambra 21 2.71 56.45
17 Kebbi 21 2.71 59.16
18 Kogi 21 2.71 61.87
19 Bauchi 20 2.58 64.45
20 Lagos 20 2.58 67.03
21 Ogun 20 2.58 69.61
22 Cross River 18 2.33 71.94
23 Edo 18 2.33 74.27
24 Ondo 18 2.33 76.60
25 Abia 17 2.20 78.80
26 Enugu 17 2.20 81.00
27 Plateau 17 2.20 83.20
28 Yobe 17 2.20 85.40
29 Ekiti 16 2.07 87.47
30 Kwara 16 2.07 89.54
31 Taraba 16 2.07 91.61
32 Zamfara 14 1.81 93.42
33 Ebonyi 13 1.68 95.10
34 Nasarawa 13 1.68 96.78
35 Gombe 11 1.42 98.20
36 Bayelsa 8 1.03 99.23
37 Federal Capital Territory 6 0.78 100.00
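For readers who want to see roughly how the frequency, percentage and cumulative_perc columns above are derived, the following base R sketch rebuilds the same kind of table on a small made-up vector. It is an approximation of what freq() reports, not the funModeling implementation, and the vector `states` is purely illustrative.

```r
# Illustrative input: one entry per LGA, holding its state name
states <- c("Kano", "Kano", "Kano", "Oyo", "Oyo", "Lagos")

# Count LGAs per state, most frequent first
counts <- sort(table(states), decreasing = TRUE)

freq_tbl <- data.frame(
  state      = names(counts),
  frequency  = as.integer(counts),
  percentage = round(100 * as.integer(counts) / sum(counts), 2)
)

# Running total of the percentage column
freq_tbl$cumulative_perc <- cumsum(freq_tbl$percentage)
freq_tbl
```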
Nigeria’s 774 Local Government Areas (LGAs) are spread across 37 Level 1 administrative areas (36 states plus the Federal Capital Territory, referred to here simply as states), with Kano having the most LGAs (44).
There are simply too many LGAs, both large and small, to analyse meaningfully at a glance.
Calling ttm() from the tmap package toggles tmap between static plotting and interactive viewing; here it switches to interactive mode, which helps us visualize the map better. Without this change, the map would be too small for any type of analysis. Additionally, we base the map’s colouring on states (Level 1 administration areas).
Given that there are 37 states, we must raise the maximum number of categories from the default value of 30 to 37, which can be done with tmap_options(max.categories = 37).
ttm()
tmap_options(max.categories = 37)
Now we are ready to build our map with the functions in the tmap package.
tm_shape(nga) +
  tm_polygons("ADM1_EN") +
  tm_borders(alpha=0.5) +
  tm_scale_bar() +
  tm_grid(alpha=0.2) +
  tm_layout(main.title="Map of Nigeria LGA",
            main.title.position="center",
            main.title.size=1.2,
            legend.height = 0.35,
            legend.width = 0.35,
            frame = TRUE)
Aspatial Data
Cleaning the Data
The aspatial dataset used in this assignment is the Water Point Data Exchange (WPdx) dataset found in the WPdx Global Data Repositories. The data is filtered on the web portal to keep only Nigeria, and the file is saved as NigeriaWaterPoints_Raw.csv.
As we are only interested in the functionality of the water points, it is important to capture the fields that may affect functionality:
LGA: the area we are interested in
State: the state the LGA belongs to
Functional: whether the water point is functional or not
Management: who manages the water point
Quality: the reported (subjective) quality of the water
Water Source Category: where the water comes from
Water Tech Category: the technology used
latitude
longitude
To load the raw data file, we use the read_csv() function.
wpdx_raw = read_csv("data/aspatial/NigeriaWaterPoints_Raw.csv")
Most of the columns are irrelevant, so we perform the following:
Keep only the columns we want by specifying the columns to retain with subset()
Rename the columns using rename_with()
Replace all NA values with "Unknown" in the columns where NA is present
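The three steps listed above can be tried on a toy table before running them on the full file. The sketch below is an alternative formulation, not the code used in this exercise: it keeps and renames columns in one step through a named lookup vector passed to select(all_of()), then fills NA values with replace_na(). The tibble `raw` and its `#other` column are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the raw WPdx table (values are illustrative)
raw <- tibble(
  `#clean_adm2`   = c("Moba", "Ohaukwu"),
  `#status_clean` = c(NA, "Functional"),
  `#other`        = 1:2            # hypothetical column we do not need
)

# new name = old name; columns not listed here are dropped
lookup <- c(LGA = "#clean_adm2", Functional = "#status_clean")

clean <- raw %>%
  select(all_of(lookup)) %>%                      # keep and rename together
  replace_na(list(Functional = "Unknown"))        # fill missing status

clean
```

Keeping the old and new names side by side in one named vector makes it harder for the two lists to drift out of order.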
retain_cols <- c('#clean_adm2', '#clean_adm1', '#status_clean', '#management_clean', '#subjective_quality', '#fecal_coliform_presence', '#water_source_category', '#water_tech_category', '#lat_deg', '#lon_deg')
new_col_names <- c('LGA', 'State', 'Functional', 'Management', 'Quality', 'presence_of_fecal_coliform', 'Water_Source_Category', 'Water_Tech_Category', 'latitude', 'longitude')
wpdx_clean = subset(wpdx_raw, select = (names(wpdx_raw) %in% retain_cols)) %>%
  rename_with(~ new_col_names, all_of(retain_cols)) %>%
  replace_na(list(Functional = "Unknown", Management = "Unknown", Quality = "Unknown", Water_Source_Category = "Unknown", Water_Tech_Category = "Unknown"))
We save the clean file with saveRDS(). The file is reduced from the 144MB raw download to 1.6MB.
saveRDS(wpdx_clean, "data/aspatial/wpdx_clean.rds")
We can then delete the raw file from the project and retrieve the saved RDS file using readRDS().
wpdx_clean = readRDS("data/aspatial/wpdx_clean.rds")
Converting csv data into spatial features
We can use st_as_sf() to create a simple feature data frame from the longitude (x) and latitude (y) values. EPSG code 4326 is used because the dataset references the WGS 84 geographic coordinate system. We can use st_crs() to verify the coordinate system of the object.
wpdx_clean_sf = st_as_sf(wpdx_clean, coords = c("longitude", "latitude"), crs=4326)
st_crs(wpdx_clean_sf)
Coordinate Reference System:
User input: EPSG:4326
wkt:
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
We can then use glimpse() to verify each field’s data type and available values.
There are 95,008 water points in the dataset. The result also shows that the longitude and latitude values have been converted into a geometry column of points, and the two original columns have been dropped.
glimpse(wpdx_clean_sf)
Rows: 95,008
Columns: 8
$ Water_Source_Category <chr> "Unknown", "Well", "Well", "Well", "Well", "Well…
$ Water_Tech_Category <chr> "Tapstand", "Mechanized Pump", "Hand Pump", "Unk…
$ State <chr> "Ekiti", "Ogun", "Ebonyi", "Enugu", "Enugu", "Be…
$ LGA <chr> "Moba", "Obafemi-Owode", "Ohaukwu", "Isi-Uzo", "…
$ Management <chr> "Unknown", "Other", "Unknown", "Unknown", "Unkno…
$ Functional <chr> "Unknown", "Functional", "Unknown", "Unknown", "…
$ Quality <chr> "Unknown", "Acceptable quality", "Unknown", "Unk…
$ geometry <POINT [°]> POINT (5.12 7.98), POINT (3.597668 6.96453…
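The behaviour seen above, where the coordinate columns disappear into a single geometry column, can be reproduced on a two-row toy table. This is only a sketch; the data frame `pts` is made up, with coordinates rounded from the first two rows of the output.

```r
library(sf)

# Toy table with plain numeric coordinates (illustrative values)
pts <- data.frame(
  name      = c("a", "b"),
  longitude = c(5.12, 3.60),
  latitude  = c(7.98, 6.96)
)

# x comes first (longitude), then y (latitude)
pts_sf <- st_as_sf(pts, coords = c("longitude", "latitude"), crs = 4326)

names(pts_sf)           # coordinate columns are replaced by "geometry"
st_coordinates(pts_sf)  # the original values can still be recovered
```

If the original columns are still needed afterwards, st_as_sf() also accepts remove = FALSE to keep them alongside the geometry.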
Aggregate the Data
We can use freq() from the funModeling package to display the distribution of the Functional field in wpdx_clean_sf. The dataset breaks functional status down into several detailed categories, so to look only at non-functional water points we need to aggregate them into just three groups: functional, non-functional and unknown.
freq(data=wpdx_clean_sf, input = 'Functional')
Functional frequency percentage cumulative_perc
1 Functional 45883 48.29 48.29
2 Non-Functional 29385 30.93 79.22
3 Unknown 10656 11.22 90.44
4 Functional but needs repair 4579 4.82 95.26
5 Non-Functional due to dry season 2403 2.53 97.79
6 Functional but not in use 1686 1.77 99.56
7 Abandoned/Decommissioned 234 0.25 99.81
8 Abandoned 175 0.18 99.99
9 Non functional due to dry season 7 0.01 100.00
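As a quick arithmetic check on the table above, the nine status labels can be collapsed into three groups using only the printed counts. The grouping rule in this sketch is an assumption that mirrors the one used in this exercise: three labels count as functional, "Unknown" stays as it is, and everything else is treated as non-functional.

```r
# Counts copied from the frequency table above
status_counts <- c(
  "Functional"                       = 45883,
  "Non-Functional"                   = 29385,
  "Unknown"                          = 10656,
  "Functional but needs repair"      = 4579,
  "Non-Functional due to dry season" = 2403,
  "Functional but not in use"        = 1686,
  "Abandoned/Decommissioned"         = 234,
  "Abandoned"                        = 175,
  "Non functional due to dry season" = 7
)

func_list <- c("Functional", "Functional but needs repair", "Functional but not in use")

# Assign each label to one of three groups
status_group <- ifelse(names(status_counts) %in% func_list, "functional",
                ifelse(names(status_counts) == "Unknown", "unknown", "non_functional"))

totals <- tapply(status_counts, status_group, sum)
totals       # functional 52148, non_functional 32204, unknown 10656
sum(totals)  # 95008, the number of rows in wpdx_clean_sf
```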
To aggregate them into functional, non-functional and unknown, we create a new data frame for each group using the filter() function.
func_list = c("Functional", "Functional but needs repair", "Functional but not in use")
wpt_functional = wpdx_clean_sf %>%
  filter(Functional %in% func_list)
wpt_non_functional = wpdx_clean_sf %>%
  filter(!Functional %in% c(func_list, "Unknown"))
wpt_unknown = wpdx_clean_sf %>%
  filter(Functional %in% "Unknown")
From the 32,204 non-functional records, we can gain some insight into why a water point might be non-functional. Is it due to management? Is it due to technology? Is it due to the source of the water?
As with the functional status above, we can use freq() from the funModeling package to find out.
freq(data=wpt_non_functional, input = 'Management')
Management frequency percentage cumulative_perc
1 Unknown 17617 54.70 54.70
2 Community Management 8249 25.61 80.31
3 Direct Government Operation 3831 11.90 92.21
4 Other 1941 6.03 98.24
5 School Management 397 1.23 99.47
6 Health Care Facility 128 0.40 99.87
7 Other Institutional Management 25 0.08 99.95
8 Private Operator/Delegated Management 16 0.05 100.00
freq(data=wpt_non_functional, input = 'Water_Tech_Category')
Water_Tech_Category frequency percentage cumulative_perc
1 Hand Pump 20471 63.57 63.57
2 Mechanized Pump 11532 35.81 99.38
3 Unknown 169 0.52 99.90
4 Tapstand 32 0.10 100.00
freq(data=wpt_non_functional, input = 'Water_Source_Category')
Water_Source_Category frequency percentage cumulative_perc
1 Well 31470 97.72 97.72
2 Spring 733 2.28 100.00
3 Piped Water 1 0.00 100.00
From the results, we can conclude that:
More than half (54.70%) of the non-functional water points have unknown management, which raises the question of whether these water points are managed at all.
Almost all of the non-functional water points use pumps (63.57% hand pumps and 35.81% mechanized pumps), which raises the question of whether there is an issue with the pumps, or a lack of expertise to repair or replace them when they fail.
97.72% of the non-functional water points draw their water from wells.
Combining Spatial & Aspatial Data
We can use st_intersects() to find which features of two geographical datasets overlap. In our case, we need to find the water points (the aspatial dataset, now converted to points) that fall within each LGA of the Nigeria spatial dataset.
The code below does 4 things:
1. It intersects the Nigeria LGA dataset (nga) with the water point dataset (wpdx_clean_sf) and adds a new column holding the total number of water points in each LGA (total wpt), using mutate() and lengths().
2. Similar to step 1, the result is piped on to add 3 columns holding the number of functional, non-functional and unknown water points in each LGA (wpt functional, wpt non functional and wpt unknown respectively).
3. It adds 2 new columns holding the proportion of functional and non-functional water points, using mutate().
4. It selects the required columns using select(): the LGA name and code (columns 3 & 4), the Administration Level 1 name and code (columns 8 & 9), which represent the states, and the six columns added in steps 1 to 3 (columns 18 to 23). The geometry column is kept automatically because sf geometry is sticky.
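The counting idea behind step 1 is easier to see on a toy example. In the sketch below, three made-up unit squares stand in for LGAs and three made-up points stand in for water points; no CRS is set, so the coordinates are treated as planar. The helper unit_square() and all object names are hypothetical. The last line also guards against dividing by zero, which would otherwise give NaN for an area with no water points.

```r
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

areas <- st_sf(
  area     = c("A", "B", "C"),
  geometry = st_sfc(unit_square(0), unit_square(2), unit_square(4))
)

wpts <- st_sf(
  status   = c("functional", "non_functional", "non_functional"),
  geometry = st_sfc(st_point(c(0.2, 0.5)), st_point(c(0.8, 0.5)),
                    st_point(c(2.5, 0.5)))
)

# st_intersects() lists, for each area, the indices of the points inside it;
# lengths() turns each list element into a count
areas$total <- lengths(st_intersects(areas, wpts))
areas$non_functional <- lengths(
  st_intersects(areas, wpts[wpts$status == "non_functional", ]))

# Proportion of non-functional points; NA where an area has no points at all
areas$pct_non_functional <- ifelse(areas$total > 0,
                                   areas$non_functional / areas$total,
                                   NA_real_)
areas
```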
nga_wp <- nga %>%
  #combine nga with water point sf
  mutate(`total wpt` = lengths(
    st_intersects(nga, wpdx_clean_sf))) %>%
  #add columns to produce no. of functional, non functional and unknown points
  mutate(`wpt functional` = lengths(
    st_intersects(nga, wpt_functional))) %>%
  mutate(`wpt non functional` = lengths(
    st_intersects(nga, wpt_non_functional))) %>%
  mutate(`wpt unknown` = lengths(
    st_intersects(nga, wpt_unknown))) %>%
  #add columns to compute %
  mutate(pct_functional = `wpt functional`/`total wpt`) %>%
  mutate(`pct_non-functional` = `wpt non functional`/`total wpt`) %>%
  select(3:4, 8:9, 18:23)
We did not change the projection from WGS 84 earlier because doing so could have caused points to go missing before the intersection was performed. Now that the spatial feature data frame has been fully constructed, we can use st_transform() to apply the appropriate projected coordinate system. The Minna / Nigeria Mid Belt coordinate system (EPSG:26392) is applied, and st_crs() is used to confirm that the transformation was completed.
wpdx_clean_sf = st_transform(wpdx_clean_sf, crs=26392)
st_crs(wpdx_clean_sf)
Coordinate Reference System:
User input: EPSG:26392
wkt:
PROJCRS["Minna / Nigeria Mid Belt",
BASEGEOGCRS["Minna",
DATUM["Minna",
ELLIPSOID["Clarke 1880 (RGS)",6378249.145,293.465,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4263]],
CONVERSION["Nigeria Mid Belt",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",4,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",8.5,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.99975,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",670553.98,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Engineering survey, topographic mapping."],
AREA["Nigeria between 6°30'E and 10°30'E, onshore and offshore shelf."],
BBOX[3.57,6.5,13.53,10.51]],
ID["EPSG",26392]]
Visualizing the spatial distribution of water points
We can use summary() to obtain the quartiles of the counts. This helps us see how the water points are distributed and decide on suitable breaks for the map.
#summary(nga_wp$`total wpt`)
#summary(nga_wp$`wpt functional`)
summary(nga_wp$`wpt non functional`)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00   12.25   34.00   41.60   60.75  278.00
#summary(nga_wp$`wpt unknown`)
Quantile breaks are not recommended here, since the range from the third quartile (60.75) to the maximum (278) is very wide and could result in a skewed representation. Because we need to decide which style is appropriate for the map, we first compute the variance and standard deviation of the non-functional water points to better understand the dataset.
var(nga_wp$`wpt non functional`)
[1] 1376.914
sd(nga_wp$`wpt non functional`)
[1] 37.10679
The variance of this dataset is very large. Since k-means groups values so that the variation within each class is as small as possible, using the kmeans style is one way to handle this. After some experimentation, n = 6 was chosen, as it appears to be the most suitable number of classes.
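To illustrate why k-means classes suit skewed counts, the sketch below runs base R's kmeans() on a small made-up vector with many low values and one extreme value, loosely shaped like the summary above. It is not the classification tmap performs internally, only a demonstration of the underlying idea; the vector `x` and the choice of 3 centres are assumptions for illustration.

```r
set.seed(42)  # k-means starts from random centres, so fix the seed

# Illustrative counts: mostly small, with one extreme value
x <- c(0, 2, 3, 10, 12, 13, 60, 62, 278)

km <- kmeans(x, centers = 3, nstart = 10)

# Share of the total variation that remains inside the classes
total_ss <- sum((x - mean(x))^2)
km$tot.withinss / total_ss

# Members of each class
split(x, km$cluster)
```

The ratio printed in the middle is the part of the spread that the classes fail to explain; trying a few values of centers and watching this ratio fall is one informal way to settle on a number of classes.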
Functions from the tmap package are used to produce the map.
First, we use tm_shape() + tm_fill("ADM1_EN") to form Layer 1 of the map, which shows the 37 states. The Pastel1 palette is used because it is difficult to tell apart different shades of the same two or three colours; Pastel1 has more colours, making the states more distinct.
Next, we use tm_shape() + tm_fill("wpt non functional") to form Layer 2 of the map, which shows the number of non-functional water points per LGA. The palette used here is PuRd (purple-red), so that areas with very few non-functional water points are shaded in a very light colour.
On the interactive map, we can switch between layers to superimpose the non-functional water point counts on the states. With so many polygons, two maps placed side by side would be hard to interpret.
tm_shape(nga) +
  tm_fill("ADM1_EN", palette = "Pastel1") +
  tm_borders(alpha=0.5) +
  tm_grid(alpha=0.2) +
  tm_shape(nga_wp) +
  tm_fill("wpt non functional",
          palette ="PuRd", style="kmeans", n=6) +
  tm_borders(alpha=0.5) +
  tm_grid(alpha=0.2) +
  tm_layout(main.title="non functional WP - 2 Layer map",
            main.title.position="center",
            main.title.size=1.2,
            #legend.height = 0.35,
            #legend.width = 0.35,
            frame = TRUE)
Using the dplyr package, we can find out which states have the most non-functional water points and which states have the most LGAs, using the functions group_by(), summarise() and arrange().
#Sum of non functional water points
nga_wp %>%
  group_by(ADM1_EN) %>%
  summarise(NF_Frequency = sum(`wpt non functional`),
            #F_Frequency = sum(`wpt functional`),
            Total_Freq = sum(`total wpt`),
            NF_Ratio = (NF_Frequency / Total_Freq) * 100
  ) %>%
  arrange(desc(NF_Frequency))
Simple feature collection with 37 features and 4 fields
Geometry type: GEOMETRY
Dimension: XY
Bounding box: xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS: WGS 84
# A tibble: 37 × 5
ADM1_EN NF_Frequency Total_Freq NF_Ratio geometry
<chr> <int> <int> <dbl> <GEOMETRY [°]>
1 Osun 2118 5519 38.4 POLYGON ((4.910021 7.841812, 4.…
2 Kaduna 1912 4925 38.8 POLYGON ((8.273597 11.30846, 8.…
3 Kwara 1634 3531 46.3 POLYGON ((4.876071 9.157646, 4.…
4 Kano 1570 7125 22.0 POLYGON ((8.727456 12.21461, 8.…
5 Ondo 1552 2575 60.3 POLYGON ((5.937628 7.648777, 5.…
6 Katsina 1521 5465 27.8 POLYGON ((8.3992 13.0758, 8.390…
7 Jigawa 1517 9696 15.6 POLYGON ((8.399011 12.82706, 8.…
8 Cross River 1446 3492 41.4 MULTIPOLYGON (((8.818036 5.6935…
9 Plateau 1332 4701 28.3 POLYGON ((8.820398 10.38392, 8.…
10 Oyo 1329 4085 32.5 POLYGON ((4.024729 7.664918, 4.…
# … with 27 more rows
#sum of LGAs by states
nga_wp %>%
  group_by(ADM1_EN) %>%
  summarise(count = n()) %>%
  arrange(desc(count))
Simple feature collection with 37 features and 2 fields
Geometry type: GEOMETRY
Dimension: XY
Bounding box: xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS: WGS 84
# A tibble: 37 × 3
ADM1_EN count geometry
<chr> <int> <GEOMETRY [°]>
1 Kano 44 POLYGON ((8.727456 12.21461, 8.72491 12.21429, 8.721373 12.2…
2 Katsina 34 POLYGON ((8.3992 13.0758, 8.39043 13.08745, 8.38292 13.09174…
3 Oyo 33 POLYGON ((4.024729 7.664918, 4.039038 7.684978, 4.056281 7.6…
4 Akwa Ibom 31 MULTIPOLYGON (((7.530807 5.150259, 7.531415 5.146801, 7.5325…
5 Osun 30 POLYGON ((4.910021 7.841812, 4.911101 7.85011, 4.914135 7.85…
6 Borno 27 POLYGON ((14.58718 11.75277, 14.58861 11.75334, 14.59292 11.…
7 Imo 27 POLYGON ((7.422786 5.583626, 7.425965 5.585613, 7.426281 5.5…
8 Jigawa 27 POLYGON ((8.399011 12.82706, 8.390081 12.82528, 8.38304 12.8…
9 Delta 25 POLYGON ((5.985599 5.124185, 5.99217 5.117613, 5.995821 5.11…
10 Niger 25 POLYGON ((7.250727 10.03942, 7.240785 10.03998, 7.232374 10.…
# … with 27 more rows
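Both summaries above come back as simple feature collections, because summarise() on an sf object also merges the geometries of each group. When only the table is of interest, one option is to drop the geometry first with st_drop_geometry(). The sketch below shows this on made-up data; the object `toy` and its column names are hypothetical.

```r
library(sf)
library(dplyr)

# Toy stand-in for nga_wp: three areas in two states (values are made up)
toy <- st_sf(
  state = c("X", "X", "Y"),
  nf    = c(10L, 5L, 3L),    # non-functional water points
  total = c(20L, 10L, 12L),  # all water points
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 0)), st_point(c(2, 0)))
)

by_state <- toy %>%
  st_drop_geometry() %>%            # plain data frame, no geometry to merge
  group_by(state) %>%
  summarise(NF_Frequency = sum(nf),
            Total_Freq   = sum(total),
            NF_Ratio     = NF_Frequency / Total_Freq * 100) %>%
  arrange(desc(NF_Frequency))

by_state
```

The result is an ordinary tibble, which prints more compactly and avoids the cost of unioning hundreds of polygons just to read a ranking.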
Observations
According to the statistics, Osun has the highest number of non-functional water points among the 37 states (2,118), followed by Kaduna (1,912) and Kwara (1,634).
Kano, despite being the state with the most LGAs (44), ranks only 4th with 1,570 non-functional water points, whereas Osun, which ranks 1st, comprises only 30 LGAs.
Interestingly, Osun has 5,519 water points in total, more than Kaduna (4,925) and Kwara (3,531), even though both of those states are larger in size. In addition, nearly half (46.3%) of the water points in Kwara are not working.
Ondo, the state directly south-east of Osun, has a larger territory, yet over 60% of its water points are not operational.
The south-eastern and western regions of Nigeria appear to be the hotspots of non-functional water points.
The map shows no non-functional water points in the far north-east of Nigeria. Using the tmap package, we plot the functional water points to see whether there are any water points in that region at all.
This can help us work out whether the north-east is succeeding in a way that could be transferred to other parts of the nation, or whether it is uninhabited or underdeveloped.
tm_shape(nga_wp) +
  tm_fill("wpt functional",
          palette ="PuRd", style="kmeans", n=6) +
  tm_borders(alpha=0.5) +
  tm_grid(alpha=0.2) +
  tm_layout(main.title="functional WP map",
            main.title.position="center",
            main.title.size=1.2,
            #legend.height = 0.35,
            #legend.width = 0.35,
            frame = TRUE)
The north-eastern region of Nigeria has few to no water points of any kind, which suggests that the region may be underdeveloped or uninhabited.
Spatially Constrained Cluster Analysis
The Null Hypothesis of Local Moran’s I Statistics
The null hypothesis of the Local Moran’s I statistic is that there is no association between the value at one location and the values at nearby locations (Long, n.d.).
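To give a feel for what the statistic measures, the sketch below computes Local Moran's I by hand for six made-up areas arranged in a line, using the common form I_i = (z_i / m2) * sum_j(w_ij * z_j), where z is the deviation from the mean, m2 is the mean squared deviation, and w_ij are row-standardised weights. The values in `x` and the neighbour rule (adjacent positions only) are assumptions for illustration. In practice a function such as localmoran() from spdep would be used, since it also returns expectations, variances and p-values for testing the null hypothesis.

```r
# Illustrative values: three high areas followed by three low areas
x <- c(10, 12, 11, 2, 1, 3)
n <- length(x)

# Row-standardised weights: each area's neighbours are the adjacent positions
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nb <- intersect(c(i - 1, i + 1), seq_len(n))
  W[i, nb] <- 1 / length(nb)
}

z  <- x - mean(x)          # deviations from the mean
m2 <- sum(z^2) / n         # mean squared deviation
lag_z <- as.vector(W %*% z)  # average deviation of each area's neighbours

Ii <- (z / m2) * lag_z
round(Ii, 3)
```

Here every I_i is positive because each area sits next to areas with similar values (high next to high, low next to low); a high value surrounded by low ones would instead produce a negative I_i.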
Reference
Kassambara, A. (n.d.). K-Means Clustering in R: Algorithm and Practical Examples.
https://www.datanovia.com/en/lessons/k-means-clustering-in-r-algorith-and-practical-examples/
Long, A. (n.d.). Local Moran.
http://ceadserv1.nku.edu/longa//geomed/stats/localmoran/localmoran.html